we use the SRPBS Traveling Subject MRI Dataset
a traveling cohort: 9 healthy subjects traveled to 12 sites to be imaged
of the 12 sites, 9 have consistently available T1w and rsfMRI in 6 subjects
sites represent variability in both MRI manufacturer and MRI model (high variability)
this enables us to investigate the reliability of our imaging-derived phenotypes (IDPs)
IDPs computed with ANTsPyMM (latest version)
we use the intraclass correlation coefficient (ICC) to assess consistency or reproducibility of the quantitative IDPs
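the majority of IDPs show high reliability (in the ROI representation, mean ICC is about 0.90 for T1 IDPs and 0.66 for rsfMRI IDPs; see the t-test output below)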
ANTsPyMM IDPs derived from the same subjects imaged at different sites with MRI from various manufacturers show overall high reliability. This provides empirical evidence that multiple modality MRI may be used to derive quantitative phenotypes on which predictive models may be based.
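see Koo and Li (2016), doi:10.1016/j.jcm.2016.02.012, for a discussion of ICC
Cicchetti (1994) gives the following often-quoted guidelines for interpreting kappa or ICC inter-rater agreement measures:
less than 0.40: poor
0.40 to 0.59: fair
0.60 to 0.74: good
0.75 to 1.00: excellent
A different guideline is given by Koo and Li (2016):
below 0.50: poor
0.50 to 0.75: moderate
0.75 to 0.90: good
above 0.90: excellent
see this analysis of FreeSurfer on T1w, which reports values from 0.81 to 0.88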
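The two guidelines can be applied mechanically to any ICC value. The sketch below is illustrative only: the helper `icc_label` is a hypothetical name, the cut points are the commonly quoted ones (Cicchetti: 0.40, 0.60, 0.75; Koo and Li: 0.50, 0.75, 0.90), and the treatment of values that fall exactly on a boundary is an assumption. The two example values are the ICC2k results reported in this document for Scanner and Site.

```r
# Map ICC values to qualitative labels under two commonly cited guidelines.
# Boundary handling (left-closed intervals) is an assumption of this sketch.
icc_label <- function(icc, guideline = c("cicchetti1994", "koo_li2016")) {
  guideline <- match.arg(guideline)
  if (guideline == "cicchetti1994") {
    breaks <- c(-Inf, 0.40, 0.60, 0.75, Inf)
    labels <- c("poor", "fair", "good", "excellent")
  } else {
    breaks <- c(-Inf, 0.50, 0.75, 0.90, Inf)
    labels <- c("poor", "moderate", "good", "excellent")
  }
  # right = FALSE makes each interval closed on the left, e.g. [0.40, 0.60)
  as.character(cut(icc, breaks = breaks, labels = labels, right = FALSE))
}

# ICC2k values reported in this document (by Scanner and by Site)
icc2k <- c(Scanner = 0.6999, Site = 0.8092)
labels_cicchetti <- icc_label(icc2k, "cicchetti1994")
labels_koo_li <- icc_label(icc2k, "koo_li2016")
print(data.frame(icc2k, labels_cicchetti, labels_koo_li))
```

The same number can land in different categories under the two schemes, so it helps to state which guideline is being used whenever a qualitative label is reported.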
T1Hier_resnetGrade is a deep-learning-based method that
accurately predicts image quality in multi-site data.
Values range from 0 (unusable) to 3 (best quality).
Demonstrate effect of subject vs effect of Scanner or Site
## Warning: Removed 1 rows containing missing values (`geom_point()`).
What is the variability of the measurement if we control for age alone?
mdl=(lm( outcome ~ age + Subject , data=dd ))
visreg::visreg( mdl, "Subject", gg=TRUE ) +
scale_color_brewer(palette = brewpal )
What is the reproducibility of the measurement if we control for age and scanner?
mdl=(lm( outcome ~ age + Subject + Scanner + 1 , data=dd ))
visreg::visreg( mdl, "Subject", gg=TRUE ) +
scale_color_brewer(palette = brewpal )
# grid.arrange( grobs = visreg::visreg( mdl, gg=TRUE ), ncol=1, main='reproducibility' )
What is the reproducibility of the measurement if we control for site?
mdl=(lm( outcome ~ age + Subject + Site + 1 , data=dd ))
visreg::visreg( mdl, "Subject", gg=TRUE ) +
scale_color_brewer(palette = brewpal )
# grid.arrange( grobs = visreg::visreg( mdl, gg=TRUE ), ncol=1, main='reproducibility' )
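One rough way to put numbers on "effect of subject vs effect of site" is to split the total sum of squares of a two-way model into its Subject, Site and residual parts. The sketch below uses simulated data only (the data frame `dd_sim` and the effect sizes are hypothetical and not taken from the SRPBS dataset); it assumes a balanced design in which every subject is imaged at every site.

```r
# Simulated balanced traveling-subject design (hypothetical values)
set.seed(42)
n_subject <- 6
n_site <- 9
dd_sim <- expand.grid(
  Subject = factor(paste0("sub", seq_len(n_subject))),
  Site = factor(paste0("site", seq_len(n_site)))
)
subject_effect <- rnorm(n_subject, sd = 1.0)  # between-subject differences
site_effect <- rnorm(n_site, sd = 0.3)        # systematic site offsets
dd_sim$outcome <- 10 +
  subject_effect[as.integer(dd_sim$Subject)] +
  site_effect[as.integer(dd_sim$Site)] +
  rnorm(nrow(dd_sim), sd = 0.2)               # measurement noise

# Two-way fixed-effects fit and its ANOVA decomposition
fit <- lm(outcome ~ Subject + Site, data = dd_sim)
tab <- anova(fit)

# Fraction of the total sum of squares attributed to each term
ss_fraction <- setNames(tab[["Sum Sq"]] / sum(tab[["Sum Sq"]]), rownames(tab))
print(round(ss_fraction, 3))
```

In a balanced design the order of the terms does not change the decomposition. With missing subject-by-site cells, the sequential sums of squares from `anova()` depend on term order, so the fractions should then be read with more care.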
What is the effect of age if we control for the best confounds?
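library(lme4)
# note: inside lm(), "+(Subject)" is an ordinary fixed-effect term, the same as "+ Subject";
# a random intercept per subject would need lmer() with "(1 | Subject)"
mdl=(lm( outcome ~ age + Scanner + T1Hier_resnetGrade +(Subject), data=dd ))
# mdl=(lm( outcome ~ age +(Subject), data=dd ))
visreg::visreg( mdl, 'age', gg=TRUE, main="t=-7.4 vs t=-5.6" ) +
scale_color_brewer(palette = brewpal )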
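The model above loads lme4 but is fit with `lm()`, where `(Subject)` acts as a fixed effect. If a random intercept per subject were intended (an assumption; the source does not say), the lme4 form would look roughly like the sketch below. The data are simulated and every name ending in `_sim`, as well as the effect sizes, is hypothetical.

```r
library(lme4)

# Simulated data: each subject has one age and is imaged on every scanner
set.seed(1)
n_subject <- 20
n_scanner <- 5
dd_sim <- expand.grid(
  Subject = factor(seq_len(n_subject)),
  Scanner = factor(seq_len(n_scanner))
)
age_by_subject <- runif(n_subject, min = 20, max = 70)
subject_effect <- rnorm(n_subject, sd = 0.5)
scanner_effect <- rnorm(n_scanner, sd = 0.2)
dd_sim$age <- age_by_subject[as.integer(dd_sim$Subject)]
dd_sim$outcome <- 5 - 0.05 * dd_sim$age +
  subject_effect[as.integer(dd_sim$Subject)] +
  scanner_effect[as.integer(dd_sim$Scanner)] +
  rnorm(nrow(dd_sim), sd = 0.1)

# Random intercept for Subject, fixed effects for age and Scanner
mdl_mixed <- lmer(outcome ~ age + Scanner + (1 | Subject), data = dd_sim)

# Fixed-effect table: Estimate, Std. Error and t value for each term
age_row <- coef(summary(mdl_mixed))["age", ]
print(age_row)
```

Because age is constant within a subject in a traveling-subject design, the age effect is estimated from between-subject differences only; a random intercept leaves room for that, whereas a fixed Subject factor absorbs it.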
What is the reproducibility of the measurement if we control for motion?
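# note: no motion covariate appears in this formula; as written it repeats the age + Subject + Scanner model
mdl=(lm( outcome ~ age + Subject + Scanner + 1 , data=dd ))
# gridExtra >= 2.0 takes the title via `top`; `main` is not an argument of grid.arrange
gridExtra::grid.arrange( grobs = visreg::visreg( mdl, gg=TRUE ), ncol=1, top='reproducibility' )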
ICC(2,k) (Two-Way Random, Absolute Agreement, Average Measures): This version is used when the raters (or measurement devices) are considered random samples from a larger population of raters and you want to generalize your findings to this broader context. It’s appropriate when different imaging sites might use different equipment or personnel, and you wish to assess the reliability of measurements across these variable conditions.
The ‘k’ forms (ICC1k, ICC2k and ICC3k in the tables below) assess the reliability of the average of k measurements (in this case, multiple imaging sessions or sites) rather than of a single measurement. Averaging reduces measurement noise, so these values are generally higher than their single-measurement counterparts.
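For intuition, the two-way ICC forms can be written in terms of ANOVA mean squares, in the style of the classical Shrout and Fleiss formulas. The sketch below is an illustration on simulated data, not the computation behind the tables in this document: the function name `icc_from_mean_squares` and all simulated values are hypothetical, and it assumes a complete subjects-by-raters matrix. The reported tables appear to come from a mixed-model fit (note the singular-fit message further down), so values may differ from this version when a variance component is estimated as zero.

```r
# Sketch: ICC forms from two-way ANOVA mean squares.
# Assumes a complete subjects x raters matrix with no missing values.
icc_from_mean_squares <- function(x) {
  n <- nrow(x)  # subjects
  k <- ncol(x)  # raters, e.g. sites or scanners
  grand <- mean(x)
  ss_subject <- k * sum((rowMeans(x) - grand)^2)
  ss_rater <- n * sum((colMeans(x) - grand)^2)
  ss_error <- sum((x - grand)^2) - ss_subject - ss_rater
  msr <- ss_subject / (n - 1)            # between-subject mean square
  msc <- ss_rater / (k - 1)              # between-rater mean square
  mse <- ss_error / ((n - 1) * (k - 1))  # residual mean square
  c(
    ICC2  = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n),
    ICC3  = (msr - mse) / (msr + (k - 1) * mse),
    ICC2k = (msr - mse) / (msr + (msc - mse) / n),
    ICC3k = (msr - mse) / msr
  )
}

# Simulated traveling-subject data (hypothetical values, not from SRPBS)
set.seed(11)
n_subject <- 6
n_site <- 9
subject_effect <- rnorm(n_subject, sd = 1.0)
site_effect <- rnorm(n_site, sd = 0.5)
x_sim <- outer(subject_effect, site_effect, "+") +
  matrix(rnorm(n_subject * n_site, sd = 0.4), n_subject, n_site)

icc_sim <- icc_from_mean_squares(x_sim)
print(round(icc_sim, 3))
```

The agreement forms (ICC2, ICC2k) include the between-rater mean square in the denominator, so systematic site offsets lower them; the consistency forms (ICC3, ICC3k) ignore such offsets.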
Summary reliability data by Scanner
print( reli_stats( "outcome", "Scanner", "Subject", data=dd ) )
##
## Coefficient of Variation (%): 13.1
## Standard Error of Measurement (SEM): 0.0278
## Standard Error of the Estimate (SEE): 0.0377
## Standard Error of Prediction (SEP): 0.0758
##
## Intraclass Correlation Coefficients with 95 % C.I.
##   Model           Measures          Type       ICC  Lower CI  Upper CI
## 1 one-way random  Agreement         ICC1    0.3138   0.06440    0.6978
## 2 two-way random  Agreement         ICC2    0.3181   0.07220    0.6987
## 3 two-way fixed   Consistency       ICC3    0.3282   0.06934    0.7097
## 4 one-way random  Avg. Agreement    ICC1k   0.6957   0.25603    0.9203
## 5 two-way random  Avg. Agreement    ICC2k   0.6999   0.28009    0.9206
## 6 two-way fixed   Avg. Consistency  ICC3k   0.7095   0.27142    0.9244
Summary reliability data by Site
print( reli_stats( "outcome", "Site", "Subject", data=dd ) )
##
## Coefficient of Variation (%): 13
## Standard Error of Measurement (SEM): 0.0276
## Standard Error of the Estimate (SEE): 0.0507
## Standard Error of Prediction (SEP): 0.102
##
## Intraclass Correlation Coefficients with 95 % C.I.
##   Model           Measures          Type       ICC  Lower CI  Upper CI
## 1 one-way random  Agreement         ICC1    0.3177    0.1255    0.6720
## 2 two-way random  Agreement         ICC2    0.3202    0.1296    0.6726
## 3 two-way fixed   Consistency       ICC3    0.3311    0.1328    0.6842
## 4 one-way random  Avg. Agreement    ICC1k   0.8074    0.5636    0.9486
## 5 two-way random  Avg. Agreement    ICC2k   0.8092    0.5726    0.9487
## 6 two-way fixed   Avg. Consistency  ICC3k   0.8167    0.5794    0.9512
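The single-measurement and average-measurement rows in the Site table are linked by the Spearman-Brown formula, ICC_k = k * ICC_1 / (1 + (k - 1) * ICC_1). The short check below plugs in the rounded single-measurement values from the table with k = 9, the number of sites with consistent data stated earlier in this document; the function name `spearman_brown` is introduced here for illustration only.

```r
# Spearman-Brown step-up: reliability of the mean of k measurements
spearman_brown <- function(icc_single, k) {
  k * icc_single / (1 + (k - 1) * icc_single)
}

k_sites <- 9  # sites with consistently available T1w and rsfMRI, per this document

# Single-measurement ICCs copied from the by-Site table (rounded to 4 digits)
icc_single <- c(ICC1 = 0.3177, ICC2 = 0.3202, ICC3 = 0.3311)

# Average-measurement ICCs reported in the same table
icc_reported <- c(ICC1k = 0.8074, ICC2k = 0.8092, ICC3k = 0.8167)

icc_average <- spearman_brown(icc_single, k_sites)
print(round(cbind(stepped_up = unname(icc_average), reported = unname(icc_reported)), 4))
```

The stepped-up values agree with the reported ones to about three decimal places; the small differences come from starting with rounded inputs. This also makes explicit why the ‘k’ rows look so much better than the single-measurement rows: they describe the reliability of a nine-site average, not of one scan.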
Site-wise reliability: pairwise comparison rsfMRI_DefaultMode_2_DefaultMode
Summary reliability data by Site
##
## Coefficient of Variation (%): 5.51
## Standard Error of Measurement (SEM): 141
## Standard Error of the Estimate (SEE): 387
## Standard Error of Prediction (SEP): 591
##
## Intraclass Correlation Coefficients with 95 % C.I.
##   Model           Measures          Type       ICC  Lower CI  Upper CI
## 1 one-way random  Agreement         ICC1    0.7070    0.5008    0.9033
## 2 two-way random  Agreement         ICC2    0.7090    0.5017    0.9039
## 3 two-way fixed   Consistency       ICC3    0.7550    0.5614    0.9224
## 4 one-way random  Avg. Agreement    ICC1k   0.9560    0.9003    0.9882
## 5 two-way random  Avg. Agreement    ICC2k   0.9564    0.9006    0.9883
## 6 two-way fixed   Avg. Consistency  ICC3k   0.9652    0.9201    0.9907
##
## [1] "T1Hier_vol_r_ivcerebellum"
##
## Coefficient of Variation (%): 4.06
## Standard Error of Measurement (SEM): 0.0352
## Standard Error of the Estimate (SEE): 0.119
## Standard Error of Prediction (SEP): 0.182
##
## Intraclass Correlation Coefficients with 95 % C.I.
##   Model           Measures          Type       ICC  Lower CI  Upper CI
## 1 one-way random  Agreement         ICC1    0.4272    0.2108    0.7560
## 2 two-way random  Agreement         ICC2    0.4521    0.2152    0.7699
## 3 two-way fixed   Consistency       ICC3    0.7425    0.5442    0.9176
## 4 one-way random  Avg. Agreement    ICC1k   0.8704    0.7063    0.9654
## 5 two-way random  Avg. Agreement    ICC2k   0.8813    0.7116    0.9679
## 6 two-way fixed   Avg. Consistency  ICC3k   0.9629    0.9148    0.9901
##
## [1] "T1Hier_thk_LRAVG_transverse_temporaldktcortex"
## boundary (singular) fit: see help('isSingular')
##
## Coefficient of Variation (%): -70.4
## Standard Error of Measurement (SEM): 0.071
## Standard Error of the Estimate (SEE): 0
## Standard Error of Prediction (SEP): 0.243
##
## Intraclass Correlation Coefficients with 95 % C.I.
##   Model           Measures          Type       ICC  Lower CI  Upper CI
## 1 one-way random  Agreement         ICC1  -0.01453  -0.07341    0.2009
## 2 two-way random  Agreement         ICC2   0.00000  -0.05776    0.2106
## 3 two-way fixed   Consistency       ICC3   0.00000  -0.06688    0.2345
## 4 one-way random  Avg. Agreement    ICC1k -0.14799  -1.60085    0.6934
## 5 two-way random  Avg. Agreement    ICC2k  0.00000  -0.96643    0.7060
## 6 two-way fixed   Avg. Consistency  ICC3k  0.00000  -1.29460    0.7338
##
## [1] "rsfMRI_DorsalAttention_2_Subcortical"
##
## Coefficient of Variation (%): 11.1
## Standard Error of Measurement (SEM): 0.041
## Standard Error of the Estimate (SEE): 0.0999
## Standard Error of Prediction (SEP): 0.174
##
## Intraclass Correlation Coefficients with 95 % C.I.
##   Model           Measures          Type       ICC  Lower CI  Upper CI
## 1 one-way random  Agreement         ICC1    0.3701    0.1647    0.7147
## 2 two-way random  Agreement         ICC2    0.3848    0.1828    0.7198
## 3 two-way fixed   Consistency       ICC3    0.4876    0.2603    0.7951
## 4 one-way random  Avg. Agreement    ICC1k   0.8409    0.6396    0.9575
## 5 two-way random  Avg. Agreement    ICC2k   0.8492    0.6681    0.9585
## 6 two-way fixed   Avg. Consistency  ICC3k   0.8954    0.7601    0.9722
##
## [1] "rsfMRI_DorsalAttention_2_DorsalAttention"
##
## Coefficient of Variation (%): 5.77
## Standard Error of Measurement (SEM): 46.4
## Standard Error of the Estimate (SEE): 115
## Standard Error of Prediction (SEP): 196
##
## Intraclass Correlation Coefficients with 95 % C.I.
##   Model           Measures          Type       ICC  Lower CI  Upper CI
## 1 one-way random  Agreement         ICC1    0.4195    0.2044    0.7507
## 2 two-way random  Agreement         ICC2    0.4313    0.2187    0.7546
## 3 two-way fixed   Consistency       ICC3    0.5279    0.2981    0.8184
## 4 one-way random  Avg. Agreement    ICC1k   0.8667    0.6981    0.9644
## 5 two-way random  Avg. Agreement    ICC2k   0.8722    0.7158    0.9651
## 6 two-way fixed   Avg. Consistency  ICC3k   0.9096    0.7926    0.9759
##
## [1] "T1Hier_vol_r_iiicerebellum"
Site-wise reliability
## Warning in sqrt(ICC3 * (1 - ICC3)): NaNs produced
##     anat                                          icc
## 23  vol_LRAVG_precentraldktcortex                 0.7049038
## 176 Frontoparietal_2_DorsalAttention              0.8271261
## 13  vol_LRAVG_medial_orbitofrontaldktcortex       0.9019878
## 40  thk_LRAVG_caudal_anterior_cingulatedktcortex  0.9306049
## 75  thk_mtg_rn_LRAVGdeep_cit168                   0.9362787
## 89  vol_l_crus_icerebellum                        0.9598178
higher scores are better
a few sites are lower quality
a few subjects exhibit lower quality (consistently)
train SiMLR on each of the top-quality sites
test on each of the other sites
this lets us look at ICC in the SiMLR space
## [1] "USE SITE: COI"
## [1] "USE SITE: HKH"
## Warning in sqrt(ICC3 * (1 - ICC3)): NaNs produced
## [1] "USE SITE: SWA"
## [1] "USE SITE: ATV"
## [1] "USE SITE: ATT"
## [1] "USE SITE: YC2"
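The train-on-one-site, test-on-the-others procedure described above can be sketched in a few lines. This is not the SiMLR analysis itself: ordinary PCA (`prcomp`) stands in for SiMLR as a simpler embedding, the data are simulated, and all names (`site_data`, `icc3k`, `icc_by_training_site`) are hypothetical. It only illustrates the loop structure: learn an embedding at one site, project the held-out sites into it, and compute an ICC on the projected scores.

```r
# Simulated IDP matrices: same subjects at every site, shared latent structure
set.seed(7)
n_subject <- 12
n_idp <- 20
n_site <- 4
latent <- matrix(rnorm(n_subject * 2), n_subject, 2)
loadings <- matrix(rnorm(2 * n_idp), 2, n_idp)
site_data <- lapply(seq_len(n_site), function(s) {
  x <- latent %*% loadings +
    matrix(rnorm(n_subject * n_idp, sd = 0.5), n_subject, n_idp)
  colnames(x) <- paste0("idp", seq_len(n_idp))
  x
})
names(site_data) <- paste0("site", seq_len(n_site))

# ICC(3,k) from two-way ANOVA mean squares; rows = subjects, columns = sites
icc3k <- function(x) {
  n <- nrow(x)
  k <- ncol(x)
  grand <- mean(x)
  ss_subject <- k * sum((rowMeans(x) - grand)^2)
  ss_site <- n * sum((colMeans(x) - grand)^2)
  ss_error <- sum((x - grand)^2) - ss_subject - ss_site
  msr <- ss_subject / (n - 1)
  mse <- ss_error / ((n - 1) * (k - 1))
  (msr - mse) / msr
}

# Train the embedding on one site, project every other site, score reliability
icc_by_training_site <- sapply(names(site_data), function(train_site) {
  pca <- prcomp(site_data[[train_site]], center = TRUE, scale. = FALSE)
  test_sites <- setdiff(names(site_data), train_site)
  # first-component score of each subject at each held-out site
  scores <- sapply(test_sites, function(s) {
    predict(pca, newdata = site_data[[s]])[, 1]
  })
  icc3k(scores)
})
print(round(icc_by_training_site, 3))
```

Because the training site is excluded when scoring, the ICC reflects how well the learned embedding transfers to sites it has not seen.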
## [1] "T-test of SiMLR vs ROI representation: T1 IDPs"
##
## Welch Two Sample t-test
##
## data: myiccsimlr[t1sel2, "icc"] and myiccs[t1sel1, "icc"]
## t = 3.261, df = 155.72, p-value = 0.001364
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.01772330 0.07218347
## sample estimates:
## mean of x mean of y
## 0.9412915 0.8963382
## [1] "T-test of SiMLR vs ROI representation: rsfMRI IDPs"
##
## Welch Two Sample t-test
##
## data: myiccsimlr[rssel2, "icc"] and myiccs[rssel1, "icc"]
## t = 2.1292, df = 82.764, p-value = 0.03621
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.005827113 0.171188380
## sample estimates:
## mean of x mean of y
## 0.7523306 0.6638229
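The two outputs above are Welch two-sample t-tests on vectors of ICC values. A minimal sketch of that kind of comparison is shown below on simulated ICCs; the vectors `icc_roi` and `icc_embedding` and their distributions are hypothetical and do not reproduce `myiccs` or `myiccsimlr`.

```r
# Simulated ICC vectors for two representations (hypothetical values in [0, 1])
set.seed(3)
icc_roi <- rbeta(80, shape1 = 8, shape2 = 2)
icc_embedding <- rbeta(80, shape1 = 12, shape2 = 2)

# t.test() defaults to var.equal = FALSE, i.e. the Welch test with
# Satterthwaite-adjusted degrees of freedom
welch <- t.test(icc_embedding, icc_roi)
print(welch)
```

The Welch form does not assume equal variances in the two groups, which is why the degrees of freedom in the reported outputs are non-integer.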
rsfMRI measurements are inter- and intra-network connectivity among the canonical functional networks:
default mode
salience
frontoparietal task control
…